ABSTRACT

Designing and controlling simple collective robot behaviors
often requires complex range and bearing sensors and peer-to-peer
communication strategies. Recent work studying swarms of robots
that have no computational power has shown that complex behaviors
such as aggregation and clustering of items can be produced from
extremely simple control policies and sensing capability. We
extend previous work on computation-free swarm behaviors and show
that it is possible to evolve simple control policies to form a
perimeter around a target, rendezvous to a specific location, and
perform foraging. We also demonstrate that simple manipulations
of the environment provide a form of stigmergic control, whereby
these collective behaviors can be controlled. The robustness and
expressiveness of these behaviors, combined with the simple
requirements for control and sensing, demonstrate the feasibility
of implementing swarm behaviors at small scales or in extreme
environments.
telligence - multiagent systems, coherence and coordination;
I.2.9 [Artificial Intelligence]: Robotics - intelligent vehicles

Keywords 

swarm  robotics,  evolutionary  algorithms,  computation-free 
robot,  controlling  collective  behaviors 

1.  INTRODUCTION 

Flocks of birds, schools of fish, and colonies of ants, bees, and
termites exhibit remarkable robustness and resilience, despite
the limited capabilities of each individual. Recently, research
into bio-inspired swarm robotics has been gaining popularity due
to the low-cost, robust, redundant, and distributed nature of
swarms [3]. Potential applications for robot swarms include
search and rescue, construction, and chemical spill clean-up, as
well as nano-medical applications such as finding tumors [10].
Many of these applications would benefit from simple, cheap,
disposable swarms of robots that can accomplish these tasks
quickly and without much human supervision.

While there has been a lot of work on different swarming
algorithms and technologies, many still require localization,
mapping, complex coordination algorithms, and precise
identification of neighboring robots’ orientations and relative
positions. This often results in swarm behaviors that are
interesting but extremely difficult to implement on actual
robotic platforms. For swarm applications in the nano-medical
field, developing collective behaviors that use extremely simple
controllers and sensors is especially important if these
behaviors are to have any hope of being implemented on
nano-robots [11].

Recently, Gauci et al. have shown that swarms of robots so dumb
that they have no computational power (they can’t even add or
subtract, and have no memory) can still collectively solve
canonical multi-robot problems such as aggregation [6] and simple
object clustering [5]. There are several key benefits to
researching the capabilities of extremely dumb robot swarms:
(1) the dumber the robot, the cheaper and more disposable it is,
(2) the simpler the control algorithm, the easier it is to
implement on real robots [8, 6], and (3) even teams of smart
robots may need a “Plan B” consisting of simple, robust
algorithms that require only the most basic capabilities in case
of malfunctions and failure.

We extend the work of Gauci et al. by showing that many other
interesting behaviors can be achieved using swarms of
computation-free robots. Our work starts with the simple robot
model proposed in [6] and adds a form of stigmergic control:
changing the environment both expands the set of possible
behaviors and allows collective behaviors to be controlled. We
investigate what behaviors are possible given limited control
over the placement of a small number of objects in the swarm’s
environment. We use a genetic algorithm approach to design these
swarm behaviors by first defining a fitness function that
describes a desired collective behavior, and then searching the
space of simple controllers for those that best achieve this
behavior. This approach allows us to evolve successively better
controllers using a robot simulator to evaluate potential
controllers. We present successful results on three behaviors
that are possible using very simple sensors and controllers:
forming a perimeter, rendezvous, and foraging. We additionally
show that simple manipulations of the environment allow these
behaviors to be controlled.
2. RELATED WORK

Trianni et al. evolved a neural-network-based controller that
performs aggregation using swarms of S-bots [13]. However, each
S-bot uses eight infrared proximity sensors, three microphones,
three sensors for detecting connections on the body, and a
gripper sensor. Baldassarre et al. evolved a controller that
aggregates a group of robots and then moves them towards a light
source [1]. Their controller utilizes a neural network that takes
eight infrared proximity sensor readings, four directional light
sensor readings, and four directional sound sensor readings as
control inputs. Gauci et al. introduced the concept of robots
that can’t compute [4]. They showed that a simple reactive
controller could be used to allow a swarm of computation-free
robots with a single line-of-sight sensor to perform aggregation
[6] and clustering [5].

Other work has looked at controlling collective behaviors.
Rubenstein et al. [12] studied how to collectively transport
items using simple control signals and behaviors. Others have
looked at controlling more complex collective transport problems
[2] or using termite-inspired stigmergic control to build complex
structures [14]. However, none of this work considers the extreme
conditions of a single line-of-sight sensor and zero computation.

3.  PROBLEM  FORMULATION 

This paper investigates what collective behaviors are possible
given a swarm of extremely simple robots operating in a simple
environment. For our experiments we consider a circularly bounded
2D environment that is homogeneous and contains no obstacles.
Throughout this space n circular robots are randomly distributed
and randomly rotated such that each robot faces a random
direction. Robots learn to interact with immovable targets,
movable objects, and other robots to achieve global behaviors.
All entities (robots, targets, and objects) are rigid, so no two
entities can occupy the same space at the same time.

We define simple robots as agents that are memoryless, cannot
perform computations, and have limited input/output capabilities.
Specifically, we look at robots that are equipped only with a
line-of-sight sensor and two wheels for differential drive. The
line-of-sight sensor can only detect the presence or absence of
objects and outputs a trinary value, where s = 2 corresponds to a
target or object in line of sight, s = 1 corresponds to a robot
in line of sight, and s = 0 corresponds to nothing in line of
sight.

Simple robots are reactive in nature because they cannot remember
past inputs or actions. As a result, simple robot controllers can
be reduced to a sequential series of if-statements that assign
left and right wheel velocities based on current sensor readings.
This controller can be represented as a set of six wheel
velocities

$$V = [v_{l0}, v_{r0}, v_{l1}, v_{r1}, v_{l2}, v_{r2}]$$

where $v_{l0}$/$v_{r0}$ are the velocities of the left/right
wheel when there is nothing in the sensor’s current line of
sight, $v_{l1}$/$v_{r1}$ are the respective velocities when a
robot is within the line of sight of the sensor, and
$v_{l2}$/$v_{r2}$ are the respective velocities when a target or
object is within the line of sight of the sensor. Velocities are
normalized such that $v \in [-1, 1]$, where 1 corresponds to a
wheel spinning forward at full speed and -1 corresponds to a
wheel spinning backwards at full speed.


if (s == 0) {
    vL = ...
    vR = ...
}
if (s == 1) {
    vL = ...
    vR = ...
}
if (s == 2) {
    vL = ...
    vR = ...
}

Figure  1:  Robot  representation  and  corresponding 
controller. 
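The controller in Figure 1 amounts to a three-row lookup table
indexed by the sensor reading. The following is a minimal Python
sketch of that idea, assuming the gene ordering of V given above;
the function name `wheel_velocities` and the placeholder values
are hypothetical and are not taken from the original work.

```python
from typing import Sequence, Tuple


def wheel_velocities(V: Sequence[float], s: int) -> Tuple[float, float]:
    """Return (vL, vR) for a sensor reading s in {0, 1, 2}.

    V is ordered [v_l0, v_r0, v_l1, v_r1, v_l2, v_r2], as in the text.
    The robot keeps no state: the output depends only on the current s.
    """
    if len(V) != 6:
        raise ValueError("controller must have exactly six wheel velocities")
    if s not in (0, 1, 2):
        raise ValueError("sensor reading must be 0, 1, or 2")
    return V[2 * s], V[2 * s + 1]


# Placeholder values for illustration only (not an evolved controller).
example = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
for reading in (0, 1, 2):
    print(reading, wheel_velocities(example, reading))
```

Because the mapping is a pure function of the current reading, it
needs no arithmetic or memory on the robot itself, which is the
sense in which the controller is computation-free.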

4.  BEHAVIORS 

This section explores several global behaviors learned using
evolutionary optimization techniques. We discover global
behaviors by optimizing a universal robotic controller according
to a behavior-dependent fitness function. Each potential robot
controller is evaluated by running a swarm simulation and
calculating the fitness function at every time step to generate a
fitness score, which is given by

$$U(V) = \sum_{t=0}^{T-1} t \, u(t)$$

where T is the number of time steps in the simulation and u(t) is
the fitness function. Multiplying the fitness function by the
time step weights later time steps more heavily, which rewards
controllers that achieve the desired behavior quickly. The robot
controllers are optimized using the average fitness score over
multiple simulations to reduce the effect of noise.
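The time-weighted score can be illustrated with a short Python
sketch. It assumes the per-step fitness values u(t) have already
been recorded from a simulation run and that larger values of
u(t) are better; the function names and the two example traces
are hypothetical.

```python
from statistics import mean
from typing import Iterable, Sequence


def fitness_score(u: Sequence[float]) -> float:
    """U(V) = sum over t = 0 .. T-1 of t * u(t)."""
    return sum(t * u_t for t, u_t in enumerate(u))


def averaged_score(runs: Iterable[Sequence[float]]) -> float:
    """Average the score over several simulation runs to reduce noise."""
    return mean(fitness_score(u) for u in runs)


# Two made-up traces: both start at -4, but one reaches 0 sooner.
fast = [-4.0, -1.0, 0.0, 0.0]
slow = [-4.0, -3.0, -2.0, -1.0]
print(fitness_score(fast), fitness_score(slow))
print(averaged_score([fast, slow]))
```

In this toy example the trace that converges earlier receives the
higher score, because poor values late in the run are multiplied
by larger time indices.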

All simulations are run on the Enki 2.0 robot simulator, which is
able to simulate hundreds of robots in a 2D environment faster
than real time [9]. For our experiments the simulation physics
are updated 100 times per second and the robot controller is
updated 10 times per second. Robots are simulated using Enki’s
e-puck model, which has a diameter of 7.4 cm, an inter-wheel
distance of 5.1 cm, and a weight of 152 g. Targets and objects
are simulated as cylinders with a diameter of 10 cm using Enki’s
physical object model. Objects have a mass of 35 g and a
coefficient of friction of 0.58. Targets have a sufficiently
large mass and coefficient of friction to ensure that they are
immobile.

Robot controllers are optimized using the Covariance Matrix
Adaptation Evolution Strategy (CMA-ES) [7]. This genetic
optimization technique uses the variance of each gene to generate
mutations between generations. Earlier work by Gauci et al. has
shown that CMA-ES can effectively optimize simple robotic
controllers [5]. CMA-ES optimizes across all real numbers, which
can result in genes outside the normalized range. To avoid this
we constrain genes by applying the following sigmoid function

$$v = \frac{1 - e^{-x}}{1 + e^{-x}}$$

where x is a gene optimized by CMA-ES. For our experiments we
used the following CMA-ES parameters: population size of 13,
initial step size $\sigma(0) = 0.72$, and starting controller of
V = [0, 0, 0, 0, 0, 0].
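The gene-to-velocity mapping can be sketched in a few lines of
Python. The sigmoid above is algebraically identical to
tanh(x / 2), and the sketch uses that form because it stays
numerically stable for large |x|; the function name
`gene_to_velocity` is hypothetical.

```python
import math


def gene_to_velocity(x: float) -> float:
    """Map an unbounded gene x to a wheel velocity in [-1, 1].

    Implements v = (1 - e^{-x}) / (1 + e^{-x}), written as tanh(x / 2)
    so that very negative x does not overflow math.exp.
    """
    return math.tanh(x / 2.0)


# The all-zero starting genes map to an all-zero (stationary) controller.
start_genes = [0.0] * 6
print([gene_to_velocity(x) for x in start_genes])

# Spot-check the identity against the literal formula for moderate x.
for x in (-3.0, -0.72, 0.72, 3.0):
    literal = (1 - math.exp(-x)) / (1 + math.exp(-x))
    print(x, gene_to_velocity(x), literal)
```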

4.1  Aggregating  to  a  Target 

Figure 2: Perimeter formation around a dynamic target.

We first investigate what is possible when a single stationary
target is placed in the environment. In this behavior robots are
initially distributed randomly throughout an environment and over
time form a group around a randomly positioned target. Our
fitness function rewards global behaviors that minimize the total
distance between each robot and the target. Let $p_i(t)$
represent the position of robot i at time step t and
$p_{\text{target}}$ represent the position of the target. Then
the rendezvous fitness function is given by

$$u_{\text{rendezvous}}(t) = -\sum_{i=0}^{n-1} \lVert p_i(t) - p_{\text{target}} \rVert$$

where $\lVert \cdot \rVert$ is the Euclidean norm. This fitness
function rewards solutions where the robots are close to the
target’s location.
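The rendezvous fitness function can be sketched directly from
its definition. This is a minimal Python illustration, assuming
positions are given as 2D (x, y) tuples; the function name and
the sample coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def rendezvous_fitness(robots: Sequence[Point], target: Point) -> float:
    """Negative sum of Euclidean distances from each robot to the target."""
    return -sum(math.dist(p, target) for p in robots)


# Made-up positions: one robot 5 units away, one sitting on the target.
robots = [(3.0, 4.0), (0.0, 0.0)]
target = (0.0, 0.0)
print(rendezvous_fitness(robots, target))
```

The value is zero only if every robot coincides with the target,
which rigid bodies cannot do, so in practice the score is always
negative and approaches zero as the swarm packs around the target.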

4.1.1  Perimeter  Formation 

The controller evolved with this fitness function forms a
perimeter around the target. The evolved controller is

V = [1.0, 0.37, 1.0, 1.0, -1.0, 0.83].

With this controller the robots aggregate to the target and
arrange themselves in a perimeter around it. We also experimented
with changing the location of the target mid-simulation. The
results are shown in Figure 2. The robots converge to the target
and form a perimeter. When the target is placed in a new
location, the entire swarm quickly moves to the new location and
re-forms the perimeter. This behavior is robust and automatic:
the robots are purely reactive, so they can be controlled simply
by changing the environment, removing the need to broadcast
information to the swarm or to add control logic.
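One way to read the evolved gene values is to convert each pair
of wheel velocities into a forward speed and a turn rate using
the standard differential-drive model. The sketch below is an
interpretation aid under stated assumptions, not an analysis from
the original work: it assumes the 5.1 cm inter-wheel distance
from the simulation setup, treats a positive turn rate as
counterclockwise, and works in normalized wheel-speed units
because the maximum wheel speed is not given in the text. The
name `body_motion` is hypothetical.

```python
from typing import Tuple

WHEEL_BASE_CM = 5.1  # inter-wheel distance stated for the e-puck model


def body_motion(v_left: float, v_right: float,
                wheel_base: float = WHEEL_BASE_CM) -> Tuple[float, float]:
    """Differential-drive kinematics in normalized units.

    Returns (forward, turn_rate): forward is the mean wheel speed and
    turn_rate is (v_right - v_left) / wheel_base, positive meaning a
    counterclockwise turn (assumed sign convention).
    """
    forward = (v_left + v_right) / 2.0
    turn_rate = (v_right - v_left) / wheel_base
    return forward, turn_rate


PERIMETER = [1.0, 0.37, 1.0, 1.0, -1.0, 0.83]
LABELS = ["nothing seen (s=0)", "robot seen (s=1)", "target seen (s=2)"]

for s, label in enumerate(LABELS):
    v_left, v_right = PERIMETER[2 * s], PERIMETER[2 * s + 1]
    forward, turn_rate = body_motion(v_left, v_right)
    print(f"{label}: forward={forward:+.3f}, turn_rate={turn_rate:+.3f}")
```

Under these assumptions, a robot that sees nothing drives forward
along a clockwise arc, a robot that sees another robot drives
straight ahead, and a robot that sees the target rotates
counterclockwise almost in place. How these individual motions
combine into the perimeter is an emergent property of the swarm
and is not captured by this sketch.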

4.1.2  Rendezvous 

We are also interested in having every member of the swarm gather
as close as possible to the target, rather than just circle
around it. Rendezvous is an important behavior for swarms because
it sets the stage for more complicated behaviors by assembling a
group of robots at a specific desired location. We first tried to
find a controller for the rendezvous problem using the fitness
function described above; however, all trials resulted in
controllers in which robots would form a circle around a target.

To solve this problem we seeded the starting controller with an
aggregation solution from earlier work by Gauci et al. [6]. Using
the seeded optimization strategy we evolved the following
controller

V = [-0.72, -1.00, 1.00, -1.00, 0.99, 1.00]

which results in the rendezvous behavior. Using this controller,
robots closest to the target move towards it along a slightly
elliptical path, while other robots form clusters elsewhere. Over
time the cluster of robots around the target pulls the extra
clusters toward the target cluster. When the clusters are close
enough they merge into a single cluster around the target.
Snapshots of the resulting behavior are shown in Figure 3.
Similar to the perimeter formation, this behavior can also be
controlled by simply moving the target.

Figure 3: Rendezvous to dynamic target.

4.2  Foraging 

In this behavior objects and robots are distributed randomly
throughout the environment and the robots must gather the objects
to a specified target location. Earlier work by Gauci et al.
found an optimal controller for clustering objects; we extend
their work by showing that this controller can be used for
foraging [5]. The clustering fitness function rewards global
behaviors that minimize the total distance between each object
and the center of the cluster of objects. Let $o_i(t)$ represent
the position of object i at time step t and $\bar{o}(t)$
represent the center of the object cluster. Then the fitness
function is given by

$$u_{\text{clustering}}(t) = -\sum_{i=0}^{m-1} \lVert o_i(t) - \bar{o}(t) \rVert$$

where m is the number of objects. Using the clustering fitness
function we evolved the following controller

V = [0.72, 1.00, 0.40, 0.31, 0.53, -1.00]

which causes the robots to circle around the objects and slowly
nudge them into a central point as they pass.
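The clustering fitness function above can be sketched in Python
as follows. The sketch assumes that the center of the object
cluster is the centroid (mean position) of the objects, which the
text does not state explicitly; the function names and sample
coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Mean position of the points (assumed definition of the center)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def clustering_fitness(objects: Sequence[Point]) -> float:
    """Negative sum of distances from each object to the cluster center."""
    center = centroid(objects)
    return -sum(math.dist(o, center) for o in objects)


# Made-up layouts: a spread-out pair versus a tighter pair of objects.
spread = [(0.0, 0.0), (2.0, 0.0)]
tight = [(0.5, 0.0), (1.5, 0.0)]
print(clustering_fitness(spread), clustering_fitness(tight))
```

Note that this function refers only to the objects themselves and
not to any target, so it rewards clustering anywhere in the
environment.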

The foraging behavior occurs when we place one or several fixed
targets in the environment. Figure 4 shows the classic foraging
problem where there is a “nest” location (shown in green) where
all of the items must be gathered. Figure 5 shows an alternative
foraging scheme where multiple stationary targets are placed in
the environment. The convex hull of these targets defines the
region into which the objects will be harvested. Similar to the
previous behaviors, the foraging behavior can be controlled
simply by changing
the location of the targets. The robots will then move the items
to the new desired location, as shown in Figure 5.

Figure 4: Foraging.

Figure 5: Dynamic foraging into specified convex hull.

5. CONCLUSIONS AND FUTURE WORK

A large amount of research has been dedicated to developing
multi-agent systems that perform complex behaviors. We show that
swarms of robots that can’t compute can perform complex behaviors
such as rendezvous to a desired location, simple perimeter
monitoring of a desired location, and foraging in changing
environments. Our results demonstrate that complex behaviors can
be evolved from simple interactions between agents and that these
behaviors can be controlled during execution by simply changing
the environment. We also note that these behaviors are so simple
that they could be hardwired, requiring no computational
capabilities. We believe that this research is an important step
towards swarm behaviors that can be easily implemented in
hardware and produced at small, maybe even nano, scale. In the
future we plan to apply these behaviors to actual robots, explore
virtual targets and other forms of stigmergic control, and more
rigorously explore the space of possible behaviors given our
computation-free assumptions.